Assessment of insert sizes and adapter content in fastq data from NexteraXT libraries

Author

  • Frances S. Turner
Abstract

The Illumina NexteraXT transposon protocol is a cost-effective way to generate paired-end libraries. However, the resulting insert size is highly sensitive to the concentration of DNA used, and the variation in insert sizes is often large. One consequence is that some fragments may have an insert shorter than the length of a single read, particularly where the library is designed to produce overlapping paired-end reads that are merged into longer continuous sequences. Such small insert sizes yield fewer of these longer sequences and also result in adapter sequence at the end of the read. Here, a protocol is presented that uses publicly available tools to identify read pairs with small insert sizes, which are therefore likely to contain adapter, to check the sequence of the adapter, and to remove adapter sequence from the reads. This protocol does not require a reference genome or prior knowledge of the sequence to be trimmed. Whilst fragments with small insert sizes may be a particular problem for NexteraXT libraries, the principle can be applied to any Illumina dataset in which such small inserts are suspected.
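The principle behind the protocol can be illustrated with a small sketch. This is not the author's pipeline, which relies on existing publicly available tools that the abstract does not name; it is a minimal illustration that assumes the two reads of a pair are plain strings. When the insert is shorter than the read, the start of read 1 is the reverse complement of the start of read 2, and whatever follows in each read is adapter. The function names, the thresholds `min_insert` and `max_mismatch_rate`, and the adapter strings are all hypothetical placeholders.

```python
# Illustrative sketch only: names, thresholds and sequences are assumptions,
# not part of the published protocol.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def short_insert_length(read1, read2, min_insert=12, max_mismatch_rate=0.1):
    """Return the insert length if it is no longer than the reads, else None.

    For an insert of length L that fits inside both reads, read1[:L] should
    equal the reverse complement of read2[:L], up to sequencing errors.
    Candidate lengths are tried from longest to shortest.
    """
    longest = min(len(read1), len(read2))
    for length in range(longest, min_insert - 1, -1):
        mate = reverse_complement(read2[:length])
        if hamming(read1[:length], mate) <= max_mismatch_rate * length:
            return length
    return None


def trim_pair(read1, read2, **kwargs):
    """Return (trimmed1, trimmed2, adapter1, adapter2) for one read pair."""
    length = short_insert_length(read1, read2, **kwargs)
    if length is None:
        # No short insert detected: leave the reads untouched.
        return read1, read2, "", ""
    return read1[:length], read2[:length], read1[length:], read2[length:]


# Toy example with made-up sequences (the adapters are placeholders,
# not real Nextera adapter sequences).
insert = "ACGTTGCAAGCTTAGGCATC"
ADAPTER1 = "GGGGTTTTGG"
ADAPTER2 = "CCAATTCCAA"
read1 = insert + ADAPTER1
read2 = reverse_complement(insert) + ADAPTER2
trimmed1, trimmed2, found1, found2 = trim_pair(read1, read2)
```

Collecting the trailing sequences (`found1`, `found2` here) over many short-insert pairs is one way to check which adapter is actually present without a reference genome, since the same sequence should recur at the start of every tail.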

Similar resources

AlienTrimmer removes adapter oligonucleotides with high sensitivity in short-insert paired-end reads. Commentary on Turner (2014) Assessment of insert sizes and adapter content in FASTQ data from NexteraXT libraries

In a recent work, Turner (2014) compared the performances of two bioinformatics programs, cutadapt (Marcel, 2011) and AlienTrimmer (Criscuolo and Brisse, 2013), to trim off exogenous oligonucleotides from short-insert paired-end reads. Turner (2014) suggested that AlienTrimmer performed with very low sensitivity. Here we show that this reported lack of performance was due to inappropriate use o...

Multiple insert size paired-end sequencing for deconvolution of complex transcriptomes.

Deep sequencing of transcriptomes allows quantitative and qualitative analysis of many RNA species in a sample, with parallel comparison of expression levels, splicing variants, natural antisense transcripts, RNA editing and transcriptional start and stop sites the ideal goal. By computational modeling, we show how libraries of multiple insert sizes combined with strand-specific, paired-end (SS...

Tagmentation-Based Mapping (TagMap) of Mobile DNA Genomic Insertion Sites

Multiple methods have been introduced over the past 30 years to identify the genomic insertion sites of transposable elements and other DNA elements that integrate into genomes. However, each of these methods suffers from limitations that can frustrate attempts to map multiple insertions in a single genome and to map insertions in genomes of high complexity that contain extensive repetitive DNA....

Genome scaffolding with PE-contaminated mate-pair libraries

Scaffolding is often an essential step in a genome assembly process, in which contigs are ordered and oriented using read pairs from a combination of paired-ends libraries and longer-range mate-pair libraries. Although a simple idea, scaffolding is unfortunately hard to get right in practice. One source of problem is so-called PE-contamination in mate-pair libraries, in which a non-negligible f...

An Assessment of the Quality of Services in Public Libraries Using the Gap Analysis Model, Based on the Users' Perspective; study Case: Public Libraries Located in West Azarbayjan Province

Purpose. The main purpose of this study is to determine service quality in West Azarbayjan public libraries using the gap analysis model, based on the users' perspective. Methodology. This study is applied research conducted by the survey method. The research population was the members of West Azarbayjan public libraries, of whom 450 were randomly selected as the sample. Last ve...

Journal title:

Volume 5, Issue 

Pages -

Publication date: 2014